Haichang Gao

Attention-Guided Reward for Reinforcement Learning-based Jailbreak against Large Reasoning Models

May 19, 2026
Via arXiv

Guaranteed Jailbreaking Defense via Disrupt-and-Rectify Smoothing

May 11, 2026
Via arXiv

Re-Triggering Safeguards within LLMs for Jailbreak Detection

May 11, 2026
Via arXiv

ICU-Bench: Benchmarking Continual Unlearning in Multimodal Large Language Models

May 07, 2026
Via arXiv

Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning

May 07, 2026
Via arXiv

A Systematic Security Evaluation of OpenClaw and Its Variants

Apr 03, 2026
Via arXiv

From Assistant to Double Agent: Formalizing and Benchmarking Attacks on OpenClaw for Personalized Local AI Agent

Feb 09, 2026
Via arXiv

MLLM Machine Unlearning via Visual Knowledge Distillation

Dec 12, 2025
Via arXiv

Whispers of Data: Unveiling Label Distributions in Federated Learning Through Virtual Client Simulation

Apr 30, 2025
Via arXiv

HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor

Jan 23, 2025
Via arXiv